Lecture #13 


Binary Trees, Cont.


— Binary Search Tree Node Deletion 
— Uses for Binary Search Trees 

— Huffman Encoding 

— Balanced Trees 




Binary Tree Review 


Question #1: Is the 
above tree a valid 
binary search tree? 


Question #2: How 
about now? 


Max 


Binary Search Tree Insertion Review 


Question #1: How would you go
about inserting “Cathy”?


Darren 
John 
Arissa Casey 


Question #2: How would you go
about inserting “Priyank”?


Paulene 
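
The insertion review can be sketched in C++ roughly as follows. This is a minimal illustration, not the lecture's own code: the `Node` struct and the `insert` and `contains` functions are hypothetical names, and duplicates are assumed to be ignored.

```cpp
#include <string>

// Hypothetical node type for illustration; the lecture's own class may differ.
struct Node {
    std::string value;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(const std::string& v) : value(v) {}
};

// Insert v into the tree rooted at root and return the (possibly new) root.
// Assumption: a value that is already present is simply ignored.
Node* insert(Node* root, const std::string& v) {
    if (root == nullptr)
        return new Node(v);
    Node* cur = root;
    for (;;) {
        if (v == cur->value)
            return root;                       // already in the tree
        if (v < cur->value) {
            if (cur->left == nullptr) { cur->left = new Node(v); return root; }
            cur = cur->left;                   // keep walking left
        } else {
            if (cur->right == nullptr) { cur->right = new Node(v); return root; }
            cur = cur->right;                  // keep walking right
        }
    }
}

// Ordinary BST search, included only so the result can be checked.
bool contains(const Node* root, const std::string& v) {
    while (root != nullptr) {
        if (v == root->value) return true;
        root = (v < root->value) ? root->left : root->right;
    }
    return false;
}
```

Inserting “Cathy” walks down from the root, going left when “Cathy” is smaller than the current name and right when it is larger, until it reaches an empty slot.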
Deleting a Node from a Binary Search Tree


It's not as easy as you might think!


Now how do I re-link the nodes back together?


Can I just move Arissa into Darren’s old slot?


Hmm... It seems OK, but is our tree still a valid binary search tree?


Carey is NOT less than Arissa!


By simply moving an arbitrary node into Darren’s slot, we violate our
Binary Search Tree ordering.


Next we’ll see how to do this properly...


Deleting a Node from a Binary Search Tree 


Here’s a high-level algorithm to delete a node 
from a Binary Search Tree: 


Given a value V to delete from the tree: 


1. Find the value V in the tree, with a slightly modified
BST search.
- Use two pointers: a cur pointer & a parent pointer


2. If the node was found, delete it from the tree, 
making sure to preserve its ordering! 
- There are three cases, so be careful! 


This algorithm is very similar to our traditional BST searching
algorithm... Except it also has a parent pointer.


Step 1:

1. parent = nullptr
2. cur = root
3. While (cur != nullptr)
   A. If (V == cur->value), then we're done.
   B. If (V < cur->value), set parent = cur and move cur to cur->left.
   C. Otherwise, set parent = cur and move cur to cur->right.


Every time we move down left or right, we advance the parent
pointer as well!


When the loop finishes, cur points to the node we want to
delete, and parent points to the node above it!


We’d want our parent pointer to point to Carey’s node.
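
The two-pointer search of Step 1 might look like this in C++. It is a sketch under assumptions: the `Node` struct, the `SearchResult` struct and the `findWithParent` name are hypothetical and not taken from the slides.

```cpp
#include <string>

// Hypothetical node type for illustration.
struct Node {
    std::string value;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(const std::string& v) : value(v) {}
};

// What the modified search hands back.
struct SearchResult {
    Node* cur;     // node holding V, or nullptr if V is not in the tree
    Node* parent;  // node above cur; nullptr when cur is the root
                   // (if V is absent, this is the last node visited)
};

SearchResult findWithParent(Node* root, const std::string& v) {
    Node* parent = nullptr;
    Node* cur = root;
    while (cur != nullptr) {
        if (v == cur->value)
            break;                               // found the target
        parent = cur;                            // advance parent first...
        cur = (v < cur->value) ? cur->left       // ...then move cur down
                               : cur->right;
    }
    return {cur, parent};
}
```

The only difference from an ordinary BST search is the `parent = cur` assignment made just before each step down.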


BST Deletion: Step #2 


Once we've found our target node, we have to delete 
it. 


There are 3 cases.

Case 1: Our node is a leaf.
Case 2: Our node has one child.
Case 3: Our node has two children.


Step #2, Case #1 - Our Target Node is a Leaf


Let’s look at case #1 - it has two sub-cases!


Case 1, Sub-case #1:
The target node is NOT the root node
1. Unlink the parent node from the target node (cur) by setting
the parent’s appropriate link to NULL.
2. Then delete the target (cur) node.


In this case, our target node (cur) is
our parent node’s right child...


So we'll set parent->right to NULL to
unlink the parent and cur.


Case 1, Sub-case #2:
The target node is the root node
1. Set the root pointer to NULL.
2. Then delete the target (cur) node.


Step #2, Case #2 - Our Target Node has One Child


Let’s look at case #2 now - it also has two sub-cases!


Case 2, Sub-case #1:
The target node is NOT the root node
1. Relink the parent node to the target (cur) node’s only child.
2. Then delete the target (cur) node.


Case 2, Sub-case #2:
The target node is the root node
1. Relink the root pointer to the target (cur) node’s only child.
2. Then delete the target (cur) node.
Step #2, Case #3 - Our Target Node has Two Children


Let’s look at case #3 now. The hard one!


We need to find a replacement for our target
node that still leaves the BST consistent.


We can’t just pick some arbitrary node
and move it up into the vacated slot!


For instance, what if we tried
replacing Darren with Arissa?


Uh-oh! If we replace Darren with Arissa,
our BST is no longer valid!


So, when deleting a node with two
children, we have to be very careful!
Step #2, Case #3


We want to replace our target node with either:
1. Its left subtree’s largest-valued child, or
2. Its right subtree’s smallest-valued child.


So pick one, copy its value up, then delete that node!


Notice that our BST is still correct!


How do we delete the replacement node? With technique #1 or #2!


Why? Our replacement node is guaranteed to have either zero or one
child!


Take the largest value in our left subtree: either it has a left child or
no children at all... By definition, it can’t have a right child!


The same holds true for the smallest value in our right subtree!
By definition, it can’t have a left child!


So we can use one of our simpler deletion algorithms for the
replacement!


In this case, our node has one child, so we use Case 2.
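
Putting the three cases together, a deletion routine could be sketched as below. The assumptions are mine: an integer-valued hypothetical `Node`, the helper name `unlinkSimple`, the function name `erase`, and the choice of the left subtree's largest value as the replacement (the right subtree's smallest would work equally well).

```cpp
// Hypothetical node type for illustration.
struct Node {
    int value;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int v) : value(v) {}
};

// Cases 1 and 2: remove a node that has zero or one child.
// parent is nullptr when cur is the root.
void unlinkSimple(Node*& root, Node* parent, Node* cur) {
    // The only child, or nullptr if cur is a leaf.
    Node* child = (cur->left != nullptr) ? cur->left : cur->right;
    if (parent == nullptr)
        root = child;                 // sub-case: target is the root
    else if (parent->left == cur)
        parent->left = child;         // relink the parent's left link
    else
        parent->right = child;        // relink the parent's right link
    delete cur;
}

// Delete v from the tree. Returns false if v was not found.
bool erase(Node*& root, int v) {
    // Step 1: search with a cur pointer and a parent pointer.
    Node* parent = nullptr;
    Node* cur = root;
    while (cur != nullptr && cur->value != v) {
        parent = cur;
        cur = (v < cur->value) ? cur->left : cur->right;
    }
    if (cur == nullptr)
        return false;

    if (cur->left != nullptr && cur->right != nullptr) {
        // Case 3: find the largest value in the left subtree.
        Node* repParent = cur;
        Node* rep = cur->left;
        while (rep->right != nullptr) {
            repParent = rep;
            rep = rep->right;
        }
        cur->value = rep->value;              // copy the value up
        unlinkSimple(root, repParent, rep);   // rep has no right child
    } else {
        unlinkSimple(root, parent, cur);      // Case 1 or Case 2
    }
    return true;
}
```

Because the replacement node can never have two children, Case 3 always finishes by reusing the simpler Case 1 or Case 2 logic.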


Deletion Exercise 


Explain how you would go 
about deleting node k. 


Explain how you would go 
about deleting node e. 


[Figure: the binary search tree used for the exercise; its node labels are not recoverable]

Where are Binary Search Trees Used?


Remember the STL map?


#include <iostream>
#include <map>
#include <string>
using namespace std;

int main()
{
    map<string,float> stud2gpa;

    stud2gpa["Carey"] = 3.62;   // BST insert!
    stud2gpa["David"] = 3.99;
    stud2gpa["Dalia"] = 4.0;
    stud2gpa["Carey"] = 2.1;    // updates Carey's existing entry

    cout << stud2gpa["David"];  // BST search!
}


It uses a type of binary search tree
to store the items!

Where are Binary Search Trees Used?


The STL set also uses a type of binary search tree.


#include <set>
using namespace std;

int main()
{
    set<int> a;      // construct BST

    a.insert(2);     // insert into BST
    a.insert(3);
    a.insert(4);
    a.insert(2);     // duplicate: ignored by set
    int n;
    n = a.size();    // n is 3
    a.erase(2);      // delete from BST
}

The STL set and map use 
binary search trees (a 
special balanced kind) to 
enable fast searching. 


Other STL containers 
like multiset and 
multimap also use 
binary 
search trees. 


These containers can 
have duplicate 
mappings. (Unlike set 
and map) 
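
A small sketch of the difference, assuming nothing beyond the standard containers themselves. The function names (`copiesInSet`, `copiesInMultiset`, `gradesFor`) and the sample data are made up for illustration. Note that the C++ standard only specifies the containers' ordering and complexity guarantees; a balanced tree is the usual implementation choice rather than a requirement.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Insert v twice into a set and report how many copies it keeps.
std::size_t copiesInSet(int v) {
    std::set<int> s;
    s.insert(v);
    s.insert(v);          // ignored: a set keeps unique values
    return s.count(v);
}

// Insert v twice into a multiset and report how many copies it keeps.
std::size_t copiesInMultiset(int v) {
    std::multiset<int> ms;
    ms.insert(v);
    ms.insert(v);         // kept: a multiset allows duplicates
    return ms.count(v);
}

// A multimap may hold several values for the same key.
std::size_t gradesFor(const std::string& name) {
    std::multimap<std::string, float> grades;   // hypothetical sample data
    grades.emplace("Carey", 3.62f);
    grades.emplace("Carey", 2.1f);              // second mapping for "Carey"
    grades.emplace("David", 3.99f);
    return grades.count(name);
}
```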


Huffman Encoding: 
Applying Trees to Real-World Problems 


Huffman Encoding is a data compression 
technique that can be used to compress and 
decompress files (e.g. like creating ZIP files). 


Background 


Before we actually cover Huffman Encoding, we 
need to learn a few things... 


Remember the ASCII code? 
ASCII


Computers represent letters, punctuation and 
digit symbols using the ASCII code, storing each 
character as a number. 


When you type a character on the keyboard, it’s 
converted into a number and stored in the 
computer's memory! 
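
To see the numbers behind the characters, one could convert each `char` to an `int`. The sketch below assumes an ASCII-compatible character set (true on virtually all current platforms), and `toCodes` is a hypothetical helper name.

```cpp
#include <string>
#include <vector>

// Return the numeric code stored for each character of text.
std::vector<int> toCodes(const std::string& text) {
    std::vector<int> codes;
    for (unsigned char c : text)
        codes.push_back(static_cast<int>(c));   // a char is just a small number
    return codes;
}
```

For example, calling `toCodes("Carey")` on an ASCII system yields 67, 97, 114, 101, 121.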
The ASCII Chart


[Figure: the ASCII chart, listing the numeric code of each character. For example, '0' is 48, 'A' is 65 and 'a' is 97.]


So basically, characters are stored in
the computer’s memory as numbers!


#include <fstream>
using namespace std;

int main()
{
    char data[7] = "Carey";
    ofstream out("file.dat");
    out << data;
    out.close();
}


Similarly, when you write data out to a file,
it’s stored as ASCII numbers too!
Bytes and Bits 


Now, as you've probably heard, the computer 
actually stores all numbers as 1’s and 0's (in binary) 
instead of decimal... 


#include <fstream>
using namespace std;

int main()
{
    char data[7] = "Carey";

    ofstream out("file.dat");
    out << data;
    out.close();
}


Each character is represented by 8 bits.


Each bit can have a value of either 0 or 1
(i.e. 1 = high voltage and 0 = low voltage)


Binary and Decimal 


Every decimal number has an equivalent binary 
representation (they’re just two ways of representing the 
same thing) 


Decimal Number Binary Equivalent 
0 00000000 
1 00000001 
2 00000010 
3 00000011 
4 00000100 
255 11111111 


So that’s binary... 
Consider a Data File
Now let's consider a simple data file containing the data:
“I AM SAM MAM.”


As we've learned, this is actually stored as 13 
numbers in our data file: 


73 32 65 77 32 83 65 77 32 77 65 77 46 


And in reality, it's really stored in the computer as a
set of 104 binary digits (bits):


01001001 00100000 01000001 01001101 00100000 01010011 01000001 
01001101 00100000 01001101 01000001 01001101 00101110 


(13 characters * 8 bits/character = 104 bits) 
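
The bit pattern above can be reproduced with `std::bitset`. This is a minimal sketch that again assumes an ASCII-compatible character set; `toBits` is a hypothetical helper name.

```cpp
#include <bitset>
#include <string>

// Write each character of text as 8 binary digits, separated by spaces.
std::string toBits(const std::string& text) {
    std::string bits;
    for (unsigned char c : text) {
        if (!bits.empty())
            bits += ' ';
        bits += std::bitset<8>(c).to_string();   // e.g. 'I' (73) -> "01001001"
    }
    return bits;
}
```

Applied to “I AM SAM MAM.”, the result contains 13 groups of 8 digits, i.e. the 104 bits counted above.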
Data Compression


So our original string “I AM SAM MAM.” requires 104
bits to store on our computer... OK.


01001001 00100000 01000001 01001101 00100000 01010011 01000001 
01001101 00100000 01001101 01000001 01001101 00101110 


The question is: 


Can we somehow reduce the number of bits 
required to store our data? 


And of course, the answer is YES!
Huffman Encoding


To compress a file “file.dat” with Huffman encoding,
we use the following steps:


1. Compute the frequency of each character in
file.dat.
2. Build a Huffman tree (a binary tree) based on
these frequencies.
3. Use this binary tree to convert the original file’s
contents to a more compressed form.
4. Save the converted (compressed) data to a file.


Huffman Encoding: Step #1


Step #1: Compute the frequency of each
character in file.dat.
(i.e. Compute a histogram)


FILE.DAT
I AM SAM MAM.


Character   Frequency
'A'         3
'I'         1
'M'         4
'S'         1
Space       3
Period      1
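
Computing the histogram takes only a few lines. The sketch below counts characters in a string rather than reading file.dat, to keep it self-contained; `countFrequencies` is a hypothetical name.

```cpp
#include <map>
#include <string>

// Count how many times each character occurs in text.
std::map<char, int> countFrequencies(const std::string& text) {
    std::map<char, int> freq;
    for (char c : text)
        ++freq[c];        // operator[] creates a zero entry on first use
    return freq;
}
```

For “I AM SAM MAM.” this gives six entries that match the table: 'A' 3, 'I' 1, 'M' 4, 'S' 1, space 3 and period 1.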




Huffman Encoding: Step #2


Step #2: Build a Huffman tree (a binary tree) based on
these frequencies:

A. Create a binary tree leaf node for each entry in our table, but
don't insert any of these into a tree!
B. While we have more than one node left:
   1. Find the two nodes with lowest freqs.
   2. Create a new parent node.
   3. Link the parent to each of the children.
   4. Set the parent’s total frequency equal to
      the sum of its children’s frequencies.
   5. Place the new parent node in our grouping.
C. Now label each left edge with a “0” and
each right edge with a “1”.


[Figure: each node holds ch, freq, left and right fields; the two lowest-frequency nodes are merged under a new parent, step by step, until one tree remains]


Ok. Now we have a single binary tree.

Now we can determine the new
bit-encoding for each character.


The bit encoding for a character
is the path of 0’s and 1’s that
you take from the root of the
tree to the character of interest.


For example:


S is encoded as 0001
A is encoded as 10
M is encoded as 11
Etc...


Notice that characters
that occurred more often
in our message have
shorter bit-encodings!
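
Step #2 can be sketched with a priority queue standing in for the "grouping" of nodes. This is an illustration under assumptions: `HNode`, `ByFreq`, `buildTree` and `assignCodes` are hypothetical names, and when several nodes share the same frequency the queue may pick them in a different order than the slides did, so individual codes can differ from the ones shown above while the total encoded length stays the same.

```cpp
#include <map>
#include <queue>
#include <string>
#include <vector>

struct HNode {
    char ch;        // meaningful only for leaves
    int freq;
    HNode* left;
    HNode* right;
};

// Orders the queue so the node with the smallest frequency is on top.
struct ByFreq {
    bool operator()(const HNode* a, const HNode* b) const {
        return a->freq > b->freq;
    }
};

// Steps A and B: make one leaf per character, then merge pairs.
HNode* buildTree(const std::map<char, int>& freq) {
    std::priority_queue<HNode*, std::vector<HNode*>, ByFreq> group;
    for (const auto& entry : freq)
        group.push(new HNode{entry.first, entry.second, nullptr, nullptr});
    while (group.size() > 1) {
        HNode* a = group.top(); group.pop();    // lowest frequency
        HNode* b = group.top(); group.pop();    // second lowest
        // New parent whose frequency is the sum of its children's.
        group.push(new HNode{'\0', a->freq + b->freq, a, b});
    }
    return group.empty() ? nullptr : group.top();
}

// Step C: left edges are '0', right edges are '1'.
void assignCodes(const HNode* node, const std::string& path,
                 std::map<char, std::string>& codes) {
    if (node == nullptr)
        return;
    if (node->left == nullptr && node->right == nullptr) {
        // A one-character file has an empty path; give it the code "0".
        codes[node->ch] = path.empty() ? "0" : path;
        return;
    }
    assignCodes(node->left, path + '0', codes);
    assignCodes(node->right, path + '1', codes);
}
```

With the frequencies from Step #1, the code lengths weighted by frequency add up to 31 bits regardless of how ties are broken.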


Huffman Encoding: Step #3


Step #3: Use this binary tree to convert the original
file’s contents to a more compressed form.


i.e. find the sequence of bits (1s and 0s) for each char in the
message.


[Figure: each character of “I AM SAM MAM.” replaced by its bit-encoding]

Huffman Encoding: Step #4


Step #4: Save the converted (compressed) data to a file.


[Figure: the encoded bits saved to compressed.dat]


Notice that our new file is less
than four bytes or 31 bits long!


Our original file is 13
bytes or 104 bits long!


We saved over 69%!


Ok... So I cheated a bit...


If all we have is our 31 bits of data...
it's impossible to interpret the file!


Did 000 equal “I” or did 000 equal “Q”?


So, we must add some additional
data to the top of our compressed
file to specify the encoding we
used...


Now clearly this adds some
overhead to our file...


But usually there’s a pretty big
savings anyway!
Decoding...


1. Extract the encoding scheme from the compressed file.
2. Build a Huffman tree (a binary tree) based on the
encodings.
3. Use this binary tree to convert the compressed file’s
contents back to the original characters.


(When you hit a leaf node, output ch and then start
again at the root!)


[Figure: the bits in compressed.dat decoded back to “I AM SAM MAM.” in output.dat]
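
Decoding steps 2 and 3 could be sketched as follows. The names `TNode`, `treeFromCodes` and `decode` are hypothetical, the bits are held in a string of '0' and '1' characters for readability rather than packed into bytes, and the code table is assumed to have been extracted from the file already.

```cpp
#include <map>
#include <string>

struct TNode {
    char ch = '\0';
    bool isLeaf = false;
    TNode* left = nullptr;    // edge labeled 0
    TNode* right = nullptr;   // edge labeled 1
};

// Step 2: rebuild the tree from a table of character -> bit-encoding.
TNode* treeFromCodes(const std::map<char, std::string>& codes) {
    TNode* root = new TNode;
    for (const auto& entry : codes) {
        TNode* cur = root;
        for (char bit : entry.second) {
            TNode*& next = (bit == '0') ? cur->left : cur->right;
            if (next == nullptr)
                next = new TNode;     // create the path as needed
            cur = next;
        }
        cur->ch = entry.first;        // the path ends at this character's leaf
        cur->isLeaf = true;
    }
    return root;
}

// Step 3: walk the tree bit by bit, emitting a character at each leaf.
std::string decode(const TNode* root, const std::string& bits) {
    std::string out;
    const TNode* cur = root;
    for (char bit : bits) {
        cur = (bit == '0') ? cur->left : cur->right;
        if (cur == nullptr)
            break;                    // bits do not match the code table
        if (cur->isLeaf) {
            out += cur->ch;
            cur = root;               // start again at the root
        }
    }
    return out;
}
```

As a usage example, one table consistent with the encodings quoted earlier is S = 0001, A = 10, M = 11 together with I = 0000, space = 01 and period = 001; the last three are my assumption, since the slides' full table did not survive. Encoding “I AM SAM MAM.” with that table and passing the bits to `decode` returns the original text.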


Balanced Search Trees 


Question: 
What happens if we insert the following values 
into a binary search tree? 


5, 10, 7, 9, 8, 20, 18, 17, 16, 15, 14, 13, 12, 11


Right! We get an unbalanced tree! 


Question: 
What is the approximate big-oh cost of searching 
for a value in this tree? 


O(N)... YUCK! 
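
To check this, one can insert the values into a plain (unbalanced) BST and measure its height. The `Node` struct and the `insert`, `height` and `buildFrom` functions are hypothetical names for this sketch.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical node type for illustration.
struct Node {
    int value;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int v) : value(v) {}
};

// Ordinary BST insertion with no rebalancing; duplicates are ignored.
Node* insert(Node* root, int v) {
    if (root == nullptr)
        return new Node(v);
    if (v < root->value)
        root->left = insert(root->left, v);
    else if (v > root->value)
        root->right = insert(root->right, v);
    return root;
}

// Number of nodes on the longest path from root down to a leaf.
int height(const Node* root) {
    if (root == nullptr)
        return 0;
    return 1 + std::max(height(root->left), height(root->right));
}

// Insert the values in the given order, starting from an empty tree.
Node* buildFrom(const std::vector<int>& values) {
    Node* root = nullptr;
    for (int v : values)
        root = insert(root, v);
    return root;
}
```

For the 14 values above, the longest path (5, 10, 20, 18, 17, ..., 11) is 11 nodes deep, whereas a balanced tree holding 14 values needs only 4 levels.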
In real life, BSTs often end up looking just like our example,
especially after repeated insertions and deletions.


It'd be nice if we could come up with an improved BST ADT that
always maintains its balance.


This would ensure that all insertions, searches and
deletions would be O(log n).


Balanced Search Trees 


Well, guess what? 
CS nerds have come to the rescue! 


They've invented numerous improved binary search 
tree ADTs like 2-3 Trees, Red-Black Trees, and AVL 
Trees. 


These BST variations work (mostly) 
just like a regular binary search tree... 


but every time you add/delete a value, they 
automatically shift the nodes around so the tree is 
balanced! 
Balancing a Tree On Insertion


For example, the AVL Tree tracks the height of ALL subtrees in the
BST.


[Figure: an example tree annotated with the height of each node's left and right subtrees]


Its left subtree has a height of 3... And its right subtree has a
height of 3 too...


The AVL tree tracks these subtree height values for
every node in the tree!


After an insertion/deletion, if the height of the subtrees
under any node is different by more than one level...
Then the AVL algorithm shifts the nodes around to maintain
balance.


In the example, the left subtree has a height of 3 and the right
subtree has a height of 4... But now its right subtree has a
height of 5! So the difference between J's two subtrees is 2 levels!


While the shifting looks complex (it is), it only
takes O(log n) time!


So with just a little extra work, the tree is always
balanced and can always be searched in log n time!
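
The balance condition itself is easy to state in code. The sketch below only checks the condition; it is not an AVL tree, which would store the heights in the nodes and rotate nodes to repair an imbalance. The names `Node`, `checkedHeight` and `isBalanced` are hypothetical.

```cpp
#include <algorithm>
#include <cstdlib>

// Hypothetical node type for illustration.
struct Node {
    int value;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int v) : value(v) {}
};

// Returns the height of the subtree rooted at n, or -1 if some node in it
// has left and right subtrees whose heights differ by more than one level.
int checkedHeight(const Node* n) {
    if (n == nullptr)
        return 0;
    int lh = checkedHeight(n->left);
    if (lh < 0)
        return -1;
    int rh = checkedHeight(n->right);
    if (rh < 0)
        return -1;
    if (std::abs(lh - rh) > 1)
        return -1;                    // this node violates the condition
    return 1 + std::max(lh, rh);
}

// True when every node satisfies the height condition described above.
bool isBalanced(const Node* root) {
    return checkedHeight(root) >= 0;
}
```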


Balanced Search Trees 


You don’t need to know the gory details of any of 
these balanced BSTs for your final or projects. 


Just remember that balanced BSTs are
always O(log n) for insertion and deletion.


And if you’re ever in a job/internship interview 
and are asked a BST question... 


Always make sure to ask the interviewer 
if you may assume the BST is balanced! 


It could make or break your interview! 


